This is the documentation for the HIPS 2022 & 2023 Dataset.
The entry for each variable below describes how the trait was collected, the original data file the data was taken from, and any specific notes regarding errors or data cleaning. Variables are listed by the name used in the final version of the dataset. For all quantitative variables, observations an order of magnitude larger than other observations, those appearing as extreme outliers in histograms, and those appearing outside the primary bounds of the distribution for the location were dropped from the dataset.
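The bound-based part of this cleaning can be sketched as follows. This is a minimal illustration rather than the cleaning script used for the dataset: the function name `mask_outside_bounds` and the example values are hypothetical, and the bounds shown are the 20,000 to 50,000 range this document reports for one stand-count-derived trait in the 2023 hybrids.

```python
import math


def mask_outside_bounds(values, lower=None, upper=None):
    """Return a copy of values with out-of-bounds entries replaced by NaN.

    Hypothetical helper: bounds are chosen per trait, year, and location
    after inspecting histograms, as described in the text.
    """
    cleaned = []
    for value in values:
        too_low = lower is not None and value < lower
        too_high = upper is not None and value > upper
        cleaned.append(math.nan if too_low or too_high else value)
    return cleaned


# Made-up observations, for illustration only.
example = [31000.0, 12000.0, 48000.0, 61000.0]
cleaned = mask_outside_bounds(example, lower=20000, upper=50000)
```

Values inside the bounds pass through unchanged, so the same helper can be applied with only a lower or only an upper bound where the text gives a one-sided threshold.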
location.
This information for the Ames and Crawfordsville 2022 locations was
determined based on experiment code, which was dependent on the nitrogen
level and the field maps.
nitrogenTreatment and location the plot
belongs to. For the Missouri Valley, Ames, and Crawfordsville locations
in 2022, this information was taken from the ‘Basic info’ sheet in the
file ‘YTMC_Lisa_Plot_Coordinates_v4.xlsx’. For the Missouri Valley,
Ames, and Crawfordsville locations in 2023, this information was taken
from the ‘Basic info’ sheet in the file ‘2023_yield_ICIA_v3.xlsx’.
sublocation. These values were assigned based on
the location, sublocation, and rep number [not
included] associated with a plot.
block
number assigned was also odd. The same convention was maintained for
even rep numbers.
block column in the dataset.
sublocation the plot was located in.
sublocation the plot was located in.
qrCode were transformed
to the plot number assigned to that row and range in the original field
maps. In Missouri Valley 2022, 100 was added to hybrid plot numbers in
block 1 and 200 was added to hybrid plot numbers in
block 2 to de-duplicate the plot numbers within the
location. In Missouri Valley 2022, 400 was added to the
inbred plot numbers in block 2 to de-duplicate the plot
numbers within the location. In the Missouri Valley 2023
inbreds, 400 was added to inbred plot numbers in block 2 to
de-duplicate the plot numbers.
qrCode were
changed to their full version, genotype names of the form ‘COMMERCIAL
HYBRID X’ were converted to the market name of the commercial hybrid,
‘FILLER’ was converted to the actual genotype name when known or NA if
unknown. Additionally, typos in the genotype names ‘4N506 X 3IIH!6’ and
‘PHP02 X PHJ894’ were corrected to ‘4N506 X 3IIH6’ and ‘PHP02 X PHJ89’,
respectively.
anthesisDate for plot 146 in North Platte was in December
1901, so it was removed as a data entry error.
silkDate for plot 349 in North Platte was in January 1900,
so it was removed as a data entry error.
plantingDate and anthesisDate.
anthesisDate for plot 146 in North Platte was in December
1901, so it was removed as a data entry error. Furthermore, all values
less than 50 days were visual outliers and were converted to missing
values. In the 2023 inbreds, all values less than 42.5 days were visual
outliers and converted to missing values.
plantingDate and silkDate.
silkDate for plot 349 in North Platte was in January 1900,
so it was removed as a data entry error.
silkDate and anthesisDate for a plot.
anthesisDate or silkDate and so
anthesisSilkingInterval was not calculated. Furthermore,
all values greater than 15 days were visual outliers and were converted
to missing values. In the 2023 inbreds, values less than -15 or greater
than 25 days were visual outliers and converted to missing values.
plantingDate and anthesisDate for the plot.
plantingDate and silkDate for the plot.
anthesisDate and silkDate for the plot.
totalStandCount) scaled to be equivalent to 1/1000th of an
acre (a plotLength of 17.5 feet with an assumed spacing of
30”) and multiplied by 1000. In the 2023 hybrids, values less than
20,000 or greater than 50,000 were visual outliers and were converted to
missing values.
combineYield, combineMoisture, and
plotLength values for the plot. This calculation assumes a
30” row spacing. In Lincoln 2023, some plots accidentally had ears from
the center row harvested during hand harvesting. When this was noted,
these values were converted to missing data. Plots 1094, 1095, 1265, and
1435 in Scottsbluff were dropped as they were damaged due to solar panel
removal.
shelledCobWidth was greater than earWidth were
dropped. In the 2022 inbreds, Lincoln plot 1041 was calculated as the
mean of 5 ears rather than 6 as the remaining observation appeared to be
a data entry error. Furthermore, values in Crawfordsville less than 1
cm, values in Lincoln less than 0.75 or greater than 3 cm, and values in
Scottsbluff less than 1.25 or greater than 3.5 cm were visual outliers
and were converted to missing values. In the 2023 hybrids, all values
less than 2 or greater than 3.25 cm were visual outliers and converted
to missing values. In the 2023 inbreds, values in Lincoln less than 1.75
cm were visual outliers and converted to missing values.
kernelMassPerEar divided by kernelsPerEar
multiplied by one hundred. Plots 5122 and 5245 at Lincoln, and 835, 838,
1169, and 1211 at North Platte were calculated using 3 ears rather than
4, as the remaining observation was an order of magnitude larger than
the others. In the 2022 inbreds, values greater than 50 grams, values in
Crawfordsville less than 10 or greater than 37.5 grams, values in
Lincoln greater than 31 grams, and values in Scottsbluff greater than 32
grams were visual outliers and were converted to missing values. In the
2023 hybrids, plots 23-C-1746410, 23-C-1746525, and 23-C-1746805 were
calculated as the mean of 2 ears, and plot 23-C-1746706 was calculated
as the mean of 1 ear, rather than 3 ears, as the remaining
observation(s) appeared to be data entry errors. Furthermore, all values
less than 15 or greater than 45 grams were visual outliers and converted
to missing values. In the 2023 inbreds, values in Lincoln greater than
35 grams were visual outliers and converted to missing values.
shelledCobMass. In the cases that significant spillage was
denoted for plots from the Ames and Crawfordsville locations, the same
estimation used at Lincoln was used in place of the direct measurement.
Plot 256 at Missouri Valley and plot 664 at North Platte were calculated
using 3 ears rather than 4 as the remaining observation was an order of
magnitude larger than the other observations. In Ames and
Crawfordsville, some ears’ kernels were re-weighed due to an off-balance
scale. In this case, the re-weighing values replaced the original
values. In the 2022 inbreds, values in Ames greater than 155 grams,
values in Crawfordsville greater than 150 grams, values in Lincoln
greater than 100 grams, values in Missouri Valley greater than 135
grams, and values in Scottsbluff greater than 148 grams were visual outliers and
were converted to missing values. In the 2023 hybrids, values in
Crawfordsville greater than 275 grams were visual outliers and were
converted to missing values. In the 2023 inbreds, values in Missouri
Valley greater than 150 grams and values in Lincoln greater than 100
grams were visual outliers and were converted to missing values.
totalStandCount and multiplied by 100.
The alley length in both the Inbred HIPS and Hybrid HIPS fields was 2.5 feet, and the distance between seeds in a row was 6 inches. Nitrogen application was made on March 29, 2022 at the treatment-specified rates (Low: 75 lbs/acre, Medium: 150 lbs/acre, High: 225 lbs/acre) with urea ammonium nitrate (32-0-0). Conventional tillage was done prior to planting.
Plots in the Inbred HIPS field were 10 feet long center to center, including 7.5 feet of plants and 2.5 feet of alley. The planting date was May 5, 2022. Plots were hand-harvested (4 ears per plot) on October 8, 2022. The previous crop in the field was soybeans. The GPS coordinates for the field corners were:
NE corner: 40°51’32.96”N, 96°35’50.40”W
SW corner: 40°51’31.39”N, 96°35’54.10”W
NW Corner: 40°51’32.93”N, 96°35’54.10”W
SE Corner: 40°51’31.44”N, 96°35’50.36”W
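Other fields in this document report coordinates as signed decimal degrees, so the corner coordinates above may need converting before they can be compared or mapped. The following is a minimal sketch of that conversion; the function name `dms_to_decimal` is hypothetical, and the pattern assumes the exact degree, minute, and second layout used in the list above (with either curly or straight quote marks).

```python
import re

# Degrees, minutes, seconds, hemisphere. \u2019 and \u201d are the curly
# quote marks used in this document; \x27 and \x22 are the straight forms.
DMS_PATTERN = re.compile(
    r"(\d+)°\s*(\d+)[\u2019\x27]\s*(\d+(?:\.\d+)?)[\u201d\x22]\s*([NSEW])"
)


def dms_to_decimal(text):
    """Convert a string such as 40°51’32.96”N to signed decimal degrees."""
    match = DMS_PATTERN.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"Unrecognized coordinate: {text!r}")
    degrees, minutes, seconds, hemisphere = match.groups()
    value = int(degrees) + int(minutes) / 60 + float(seconds) / 3600
    # South and west are negative by convention.
    return -value if hemisphere in "SW" else value


# NE corner of the Inbred HIPS field, as listed above.
ne_latitude = dms_to_decimal("40°51’32.96”N")
ne_longitude = dms_to_decimal("96°35’50.40”W")
```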
Plots were two-row plots, and the field contained 16 plots (160 feet) north to south and 58 plots (290 feet) east to west.
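The stated field dimensions can be checked with simple arithmetic. The sketch below assumes a 30-inch row spacing, which is stated elsewhere in this document but not in this paragraph, and assumes that plot length runs north to south while rows are counted east to west.

```python
# Assumption: 30-inch rows, as stated elsewhere in this document.
ROW_SPACING_FT = 30 / 12
PLOT_LENGTH_FT = 10   # center to center, including the 2.5 ft alley
ROWS_PER_PLOT = 2     # two-row plots

# 16 plots end to end, north to south.
north_south_ft = 16 * PLOT_LENGTH_FT
# 58 two-row plots side by side, east to west.
east_west_ft = 58 * ROWS_PER_PLOT * ROW_SPACING_FT

print(north_south_ft, east_west_ft)
```

Under these assumptions the results agree with the 160 feet and 290 feet given above.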
Plots in the Hybrid HIPS field were 20 feet long center to center, including 17.5 feet of plants and 2.5 feet of alley. The planting date was May 22, 2022. Plots were hand-harvested (4 ears per plot) on October 1, 2022 and combine harvested on October 10, 2022. The previous crop in the field was maize. The GPS coordinates for the field corners were:
NE corner: 40°51’8.59”N, 96°36’50.26”W
SW corner: 40°51’7.25”N, 96°37’0.54”W
NW Corner: 40°51’8.70”N, 96°37’0.52”W
SE Corner: 40°51’7.12”N, 96°36’50.29”W
The field was 150 feet north to south and 800 feet east to west. Differential weed pressure existed throughout the field according to James.
175 lbs of urea was applied on both the hybrid and inbred fields in this location. The nitrogen fertilizer was applied on June 6, 2022. The field coordinates were 41.671747, -95.943982. The fields were not irrigated. Planting was completed on April 29, 2022 and the harvest was completed on October 11, 2022. The previous crop was corn.
There were 752 plots (376 plots per replication) in this field. The plots were two-row plots. The plot numbers shown in the image correspond to those used in the QR codes.
There were 176 plots (88 plots per replication) in this field. The
plots were four-row plots.
The plot numbers shown in the image correspond to those used in the QR codes. The QR codes mis-assigned the replicate numbers. These have been fixed in the dataset.
Only the Hybrid HIPS population was grown at this location. Plot numbers are unique across the North Platte location, but range and row numbers were duplicated in each field. Each field is a different irrigation treatment. To account for this, the location was split into three by irrigation treatment in the data. North Platte1 is the full irrigation field, North Platte2 is the partial irrigation field, and North Platte3 is the dryland (i.e. rainfed) field. The plots were 4-row, 20-foot plots, with 17 feet of the plot planted and 3 feet of alley. Nitrogen treatments were blocked within each irrigation treatment. The full irrigation field was planted on May 17, 2022, and harvested on October 21, 2022 and November 1, 2022 with planted plot lengths of 17.5 feet. The partial irrigation field was planted on May 17, 2022 and harvested on October 26-28, 2022 with planted plot lengths of 17.5 feet. The dryland field was planted on May 18, 2022 and harvested on October 19-21 and 24, 2022 with planted plot lengths of 17 feet. The previous crop was soybeans. Nitrogen was applied on June 16, 2022 as 32-0-0 with a 360 Y drop applicator.
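The reason for splitting the location can be illustrated with a small sketch: once the irrigation treatment is folded into the location name, the combination of location, range, and row identifies a single plot. The three location names come from the text; the record layout, the treatment labels (`full`, `partial`, `dryland`), and the plot numbers are hypothetical.

```python
# Location names are from the text; the dictionary keys are assumed labels.
LOCATION_BY_IRRIGATION = {
    "full": "North Platte1",
    "partial": "North Platte2",
    "dryland": "North Platte3",
}


def plot_key(record):
    """Build a (location, range, row) key that is unique across the fields."""
    location = LOCATION_BY_IRRIGATION[record["irrigation"]]
    return (location, record["range"], record["row"])


# Made-up records: the same range and row occur once in each field.
records = [
    {"plot": 1, "irrigation": "full", "range": 1, "row": 1},
    {"plot": 2, "irrigation": "partial", "range": 1, "row": 1},
    {"plot": 3, "irrigation": "dryland", "range": 1, "row": 1},
]
keys = [plot_key(record) for record in records]
```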
In the image above, the blue rectangle is the approximate location of the full irrigation field, the yellow rectangle is the approximate location of the partial irrigation field, and the red rectangle is the approximate location of the dryland field.
The irrigation amounts and timing for the full and partial fields are as follows:
The field was irrigated for one hour from 4:00 p.m. to 5:00 p.m. every Friday evening, for a total of 16.86 inches of irrigation provided over the growing season. The row spacing was 30”. In-field measurements (flowering time, height data, and combine yield measurements) use the range-row system defined in the sheet ‘Layout (Original)’ of the files ‘Scottsbluff Hybrid HIPS - Summary.xlsx’ and ‘Scottsbluff Inbred HIPS - Summary.xlsx’. In the data, these range-row assignments were dropped in favor of using the range-row assignments listed in the QR codes and used by the ear phenotype and NIR grain composition measurements, which better capture the spatial distance between plots where there are border plots. The geographic location of each plot number within the field is the same in both layouts, but the range-row numbering is different. The range-row system used in the QR codes and the data is depicted in the sheet ‘Layout (Modified)’ of the files ‘Scottsbluff Hybrid HIPS - Summary’ and ‘Scottsbluff Inbred HIPS - Summary.xlsx’, and shown in the image below. ‘Fill’ is equivalent to ‘Border’ at other locations. The field was planted north to south and, together, the inbred and hybrid fields are 625 feet north to south. The fields were planted on May 19, 2022. A combination of 10-34-0 (NPK) liquid fertilizer and urea was applied to meet the nitrogen treatment level requirements, as shown below. Urea applications were made on July 8, 2022. Both field layouts denote that planting of the hybrid plots started in the southeast corner of the field; however, the correlations between two replicates of a genotype within a treatment indicate that planting started in the southwest corner of the field. The inbreds were still located directly to the west of the hybrids. In the corrected layout, plot 1001 is in the SW corner of the field, plot 1025 is in the NW corner of the field, the NE corner of the field is a fill plot, and plot 1491 is in the SE corner of the field.
The previous crop was dry beans.
The plots were 10 foot (7.5 feet planted), 2-row plots. This field was very weedy according to Ramesh.
Based on yield data and grain protein content, it was determined that the labels in the QR codes for the high and low nitrogen treatments were reversed. The QR codes in the data reflect the content of the original QR codes, and the nitrogen treatment variable reflects the actual level of nitrogen the plot received. The plots were 25 foot (22.5 feet planted), 4-row plots and the middle two rows were harvested.
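The relationship between the QR code label and the nitrogen treatment variable for this field can be sketched as a simple swap. This is an illustration only: the function name is hypothetical, and the label strings `High`, `Medium`, and `Low` are assumed to match the treatment names used in this document rather than the exact text stored in the QR codes.

```python
# Only the high and low labels were reversed; anything else passes through.
SWAPPED_LABELS = {"High": "Low", "Low": "High"}


def actual_nitrogen_treatment(qr_label):
    """Return the nitrogen level the plot actually received (hypothetical helper)."""
    return SWAPPED_LABELS.get(qr_label, qr_label)


qr_labels = ["High", "Medium", "Low"]
actual = [actual_nitrogen_treatment(label) for label in qr_labels]
```

Keeping the original QR code untouched and storing the corrected level in a separate variable, as the dataset does, preserves the link back to the physical labels.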
These fields were planted on May 11, 2022 and harvested on October 7, 2022. Both the inbred and hybrid fields had 3 nitrogen treatments (High, Medium, and Low). The field coordinates were 41.199066, -91.486991. The nitrogen treatments were applied on June 2, 2022 using 32% UAN.
Low nitrogen is to the west in these maps. The previous crop was soybeans.
The previous crop was soybeans.
These fields were planted on May 23, 2022 and harvested on October 16, 2022. Both the inbred and hybrid fields had 3 nitrogen treatments (High, Medium, and Low). The High and Medium nitrogen treatments for both hybrids and inbreds were located in the B1 field. The Low nitrogen treatments for both hybrids and inbreds were located in the E1 field.
This field was planted on May 22, 2022. Nitrogen was applied using urea and 32% UAN on May 17, 2022, and June 1, 2022. The field coordinates were 42.015354, -93.732519. The previous crop was corn.
This field was planted on May 23, 2022. Nitrogen was applied using urea on May 21, 2022. The field coordinates were 42.012376, -93.737301. The previous crop was soybeans.
Plots were 4 row plots with 30” row spacing and were 20 feet center to center, including 17.5 feet of plants and 2.5 feet of alley. Nitrogen was applied on June 15, 2023 at 150 lbs/acre of N as 32-0-0, surface applied with a Y-Drop applicator. The previous crop was soybean. The field had 8 underground irrigation zones randomized within each of 4 blocks. Five of the zones were set to apply 4.5 inches of irrigation over the growing season, and three of the zones applied 0 inches of irrigation. The field coordinates were 41.086705°, -100.775034° and plots were planted on May 10, 2023. Hand harvesting was done on October 11, 2023, and mechanical harvest was done on October 19, 2023.
There was significant hail damage the night of July 22-23, during the middle of the tasseling period.
All plots had 30 inch row spacing and 2.5 ft alleys. The field coordinates were 40.859683°, -96.596310°. The previous crop was soybeans.
The field was planted on May 16, 2023. 3 rates of nitrogen (75, 150, and 225 lbs/acre) were applied as liquid urea. Plots were 4-row plots, 20 feet center to center. Hand harvest was completed on September 25, 2023 and mechanical harvesting was done on October 23, 2023.
The field was planted on May 9, 2023. A single rate of nitrogen (150 lbs/acre) was applied as liquid urea. Plots were 2-row plots. Hand harvest was completed on September 30, 2023. Plots were 10 feet center to center with 2.5 ft alleys.
The field coordinates were 41.671045°, -95.945240°. The row spacing was 30” and the previous crop was soybean. A single nitrogen rate (160 lbs/acre) was applied with NH3. Both inbred and hybrid fields were planted May 2, 2023. Hand harvest for both fields was completed on September 17, 2023.
Plots were 20 feet center to center, with 2.5 ft alleys, and 4 rows. Mechanical harvest was completed on September 25, 2023.
Plots were 10 feet center to center, with 2.5 ft alleys, and 2 rows.
The field coordinates were 42.014654°, -93.728797°. The previous crop was soybeans. Both inbreds and hybrids were planted on May 19, 2023. 3 rates of nitrogen (75, 150 and 225 lbs/acre) were applied with 32% UAN. Row spacing was 30”. In the map, hybrids are located between the sets of inbreds.
Plots were 4 row plots, 20 feet center to center with 2.5 ft alleys. Hand harvest was completed on October 15, 2023. Mechanical harvest was completed on October 19, 2023.
Plots were 2 row plots, 10 feet center to center with 2.5 ft alleys. Hand harvest was completed between October 17 and November 16, 2023.
The field coordinates were 41.194394°, -91.478950°. The previous crop was soybeans. Both inbreds and hybrids were planted on May 4, 2023. 3 rates of nitrogen (75, 150 and 225 lbs/acre) were applied with 32% UAN. Row spacing was 30”. In the map, hybrids are located between the sets of inbreds.
Plots were 4 row plots, 20 feet center to center with 2.5 ft alleys. Hand harvest was completed between September 29 and October 1, 2023. Mechanical harvest was completed on October 2, 2023.
Plots were 2 row plots, 10 feet center to center with 2.5 ft alleys. Hand harvest was completed between September 29 and October 1, 2023.
Upon request, the following data is available:
This is the length in centimeters of the ear leaf from the stalk to leaf tip of one plant from the plot. It is abbreviated as ‘leaf_len1’. This information for the Lincoln location was taken from the sheet ‘Combined Dataset’ in the file ‘Summary of Lincoln Hybrid HIPS 2022 Data.xlsx’.
This is the width in centimeters of the midpoint of the ear leaf of one plant from the plot. It is abbreviated as ‘leafWidth1’. This information for the Lincoln location was taken from the sheet ‘Combined Dataset’ in the file ‘Summary of Lincoln Hybrid HIPS 2022 Data.xlsx’.
This is the order the ears were phenotyped. It is abbreviated in the code as ‘earNum’, but does not appear in the final dataset as the ear data was transformed to have four columns each for ear width, kernel fill length, kernel row number, kernels per row, ear weight, seed color, total kernel count, cob length, cob width, cob weight, and 100 kernel weight. These variables with the suffix ‘1’ correspond to ears with an ear number of 1 in the dataset, and this convention is maintained for the variable suffixes 2-4. All ears with an ear number greater than 4 were dropped to create a balanced dataset.
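The long-to-wide transformation described above can be sketched as follows. This is a minimal illustration, not the code used to build the dataset: the function name `ears_to_wide`, the `plot` key, and the example values are hypothetical, while `earNum` and the `earWt` prefix follow the abbreviations used in this document.

```python
MAX_EARS = 4  # ears numbered above 4 are dropped to keep plots balanced


def ears_to_wide(ear_rows, traits):
    """Turn one-row-per-ear records into one record per plot.

    Each trait becomes columns trait1..trait4; ears that were not
    measured are left as None.
    """
    wide = {}
    for row in ear_rows:
        ear_num = row["earNum"]
        if ear_num > MAX_EARS:
            continue
        empty = {f"{t}{n}": None for t in traits for n in range(1, MAX_EARS + 1)}
        plot_record = wide.setdefault(row["plot"], empty)
        for trait in traits:
            plot_record[f"{trait}{ear_num}"] = row.get(trait)
    return wide


# Made-up ear records for a single plot.
ear_rows = [
    {"plot": 101, "earNum": 1, "earWt": 150.2},
    {"plot": 101, "earNum": 2, "earWt": 143.8},
    {"plot": 101, "earNum": 5, "earWt": 160.0},
]
wide = ears_to_wide(ear_rows, ["earWt"])
```

In this example the fifth ear is discarded and the unmeasured third and fourth ears remain missing, so every plot ends up with the same four columns per trait.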
This is the weight of one ear from the plot in grams prior to shelling (i.e. with the kernels attached to the cob). This corresponds to the ear with an ear number of 1 from the original dataset. It is abbreviated as ‘earWt1’. This information for Lincoln hybrids, Missouri Valley hybrids and inbreds, and some of the North Platte location was taken from the file ‘2022_Hybrid HIPS - Post Harvest Data - Prototype File.csv’. Information for the remaining ears from North Platte and the Scottsbluff hybrids was taken from the file ‘NP-SB_2022’.
This is the number of ears that fell to the ground in the middle two rows of the plot for hybrids. It is abbreviated as ‘earDropNum’. This information for North Platte was taken from the sheets ‘No Irr Data’, ‘Reduced Irr Data’, and ‘Full Data’ in the file ‘2022_Schnable_HIPS_data_v4.xlsx’.
This is the number of standing plants in one row of the plot. It is abbreviated as ‘standCt1’. This information for Missouri Valley was taken from the sheets ‘RawData (4-Row)’ and ‘RawData (2-Row)’ for hybrids and inbreds, respectively, in the file ‘YTMC_Lisa_Plot_Coordinates_v4.xlsx’. This information for North Platte was taken from the sheets ‘No Irr Data’, ‘Reduced Irr Data’, and ‘Full Data’ in the file ‘2022_Schnable_HIPS_data_v4.xlsx’.
This is the number of standing plants in a second row of the plot. It is abbreviated as ‘standCt2’. This information for Missouri Valley was taken from the sheets ‘RawData (4-Row)’ and ‘RawData (2-Row)’ for hybrids and inbreds, respectively, in the file ‘YTMC_Lisa_Plot_Coordinates_v4.xlsx’. This information for North Platte was taken from the sheets ‘No Irr Data’, ‘Reduced Irr Data’, and ‘Full Data’ in the file ‘2022_Schnable_HIPS_data_v4.xlsx’.
This is the historical flowering time, in days, of the inbred, and was used for blocking the inbreds. It is abbreviated as ‘histFT’. This information was taken from the sheet ‘RawData (2-Row)’ in the file ‘YTMC_Lisa_Plot_Coordinates_v4.xlsx’.
This is the historical plant height, in centimeters, of the inbred, and was used for blocking the inbreds. It is abbreviated as ‘histPlantHt’. This information was taken from the sheet ‘RawData (2-Row)’ in the file ‘YTMC_Lisa_Plot_Coordinates_v4.xlsx’.
This is the block the inbred was placed in based on historical plant height and days to flowering. It is abbreviated as ‘block’. This information was taken from the sheet ‘RawData (2-Row)’ in the file ‘YTMC_Lisa_Plot_Coordinates_v4.xlsx’.
This is the genotypic population the plants grown in the plot are from, either the Hybrid HIPS population, abbreviated as ‘Hybrid’, or the Inbred HIPS population, also known as the SAM population, abbreviated as ‘Inbred’. It is abbreviated as ‘population’.
This is the mean length of the ear prior to shelling (i.e. with the kernels on the ear) of one ear from the plot. It is abbreviated as ‘earLen’. This information for the Ames and Crawfordsville locations was taken from the files in the folder ‘3 Ear Traits Station’ and converted from millimeters to centimeters. For these two locations, when the ear had severe bending, a string was used to measure the length and this is denoted in the notes field. This data was not collected at the North Platte, Scottsbluff, Lincoln, and Missouri Valley locations.
This is the height in centimeters of the plant, including the tassel, of one plant from the plot (in the case of Scottsbluff) or the average of this measurement for two plants from the plot in Lincoln. All measurements from plants noted as stunted or without a silk were marked as missing data prior to calculation. It is abbreviated as ‘tasselTipHt’. This information for the Lincoln location hybrids was taken from the sheet ‘Combined Dataset’ in the file ‘Summary of Lincoln Hybrid HIPS 2022 Data.xlsx’. For Scottsbluff hybrids, this data was taken from the sheet ‘Hybrid_height data’ from the file ‘Corn_data_Scottsbluff-2022_rk_11.11.2022’ and converted from inches. Plot 1322 at the Scottsbluff location was dropped due to having a value more than 114 centimeters (roughly 3.7 feet) greater than all other observations.
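The two steps described above, converting the Scottsbluff measurements from inches and averaging the Lincoln measurements after marking stunted plants as missing, can be sketched as follows. The function names and example values are hypothetical, and the sketch assumes that a plot with one missing plant keeps the remaining plant's value, which the text does not state explicitly.

```python
CM_PER_INCH = 2.54


def inches_to_cm(inches):
    """Convert a height in inches to centimeters."""
    return inches * CM_PER_INCH


def mean_height_cm(heights_cm):
    """Average the non-missing heights; None marks a missing measurement."""
    observed = [height for height in heights_cm if height is not None]
    if not observed:
        return None
    return sum(observed) / len(observed)


# Made-up values: one Scottsbluff plant in inches, two Lincoln plants in cm.
scottsbluff_cm = inches_to_cm(100)
lincoln_cm = mean_height_cm([240.0, 260.0])
lincoln_one_missing_cm = mean_height_cm([250.0, None])
```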
This is whether or not the kernels from the plot exhibited striping. It is abbreviated as ‘kernelStriping’. This information for hybrids and Missouri Valley inbreds was taken from the file ‘plotleveleardata_v2.csv’. This information was not collected for the Ames and Crawfordsville locations.
This is the percent of plants that lodged due to stalk breakage in the middle two rows of the plot for hybrids and in the whole plot for inbreds. It is abbreviated as ‘pctStalkLodge’. This information for Missouri Valley, Crawfordsville, and Ames was taken from the sheets ‘RawData (4-Row)’ and ‘RawData (2-Row)’ for hybrids and inbreds, respectively, in the file ‘YTMC_Lisa_Plot_Coordinates_v4.xlsx’. This information for North Platte was taken from the sheets ‘No Irr Data’, ‘Reduced Irr Data’, and ‘Full Data’ in the file ‘2022_Schnable_HIPS_data_v4.xlsx’.
This is the percent of plants that lodged due to insufficient roots in the middle two rows of the plot for hybrids and in the whole plot for inbreds. It is abbreviated as ‘pctStalkLodge’. This information for Missouri Valley, Crawfordsville, and Ames was taken from the sheets ‘RawData (4-Row)’ and ‘RawData (2-Row)’ for hybrids and inbreds, respectively, in the file ‘YTMC_Lisa_Plot_Coordinates_v4.xlsx’. This information for North Platte was taken from the sheets ‘No Irr Data’, ‘Reduced Irr Data’, and ‘Full Data’ in the file ‘2022_Schnable_HIPS_data_v4.xlsx’.
This is the latitude of the plot. This information for Missouri Valley, Crawfordsville, and Ames was taken from the sheets ‘RawData (4-Row)’ and ‘RawData (2-Row)’ for hybrids and inbreds, respectively, in the file ‘YTMC_Lisa_Plot_Coordinates_v4.xlsx’.
This is the longitude of the plot. This information for Missouri Valley, Crawfordsville, and Ames was taken from the sheets ‘RawData (4-Row)’ and ‘RawData (2-Row)’ for hybrids and inbreds, respectively, in the file ‘YTMC_Lisa_Plot_Coordinates_v4.xlsx’.